What Makes a Place? Building Bespoke Place Dependent Object Detectors for Robotics
This paper is about enabling robots to improve their perceptual performance
through repeated use in their operating environment, creating local expert
detectors fitted to the places through which a robot moves. We leverage the
concept of 'experiences' in visual perception for robotics, accounting for bias
in the data a robot sees by fitting object detector models to a particular
place. The key question we seek to answer in this paper is simply: how do we
define a place? We build bespoke pedestrian detector models for autonomous
driving, highlighting the necessary trade-off between generalisation and model
capacity as we vary the extent of the place we fit to. We demonstrate a
sizeable performance gain over a current state-of-the-art detector when using
computationally lightweight bespoke place-fitted detector models.
Comment: IROS 201
Dropout Distillation for Efficiently Estimating Model Confidence
We propose an efficient way to output better calibrated uncertainty scores
from neural networks. The Distilled Dropout Network (DDN) makes standard
(non-Bayesian) neural networks more introspective by adding a new training loss
which prevents them from being overconfident. Our method is more efficient than
Bayesian neural networks or model ensembles which, despite providing more
reliable uncertainty scores, are more cumbersome to train and slower to test.
We evaluate DDN on the task of image classification on the CIFAR-10 dataset
and show that our calibration results are competitive even when compared to 100
Monte Carlo samples from a dropout network while they also increase the
classification accuracy. We also propose better calibration within the
state-of-the-art Faster R-CNN object detection framework and show, using the COCO
dataset, that DDN helps train better calibrated object detectors.
Incremental Adversarial Domain Adaptation for Continually Changing Environments
Continuous appearance shifts such as changes in weather and lighting
conditions can impact the performance of deployed machine learning models.
While unsupervised domain adaptation aims to address this challenge, current
approaches do not utilise the continuity of the occurring shifts. In
particular, many robotics applications exhibit these conditions and thus
make it possible to incrementally adapt a learnt model over minor
shifts which integrate to massive differences over time. Our work presents an
adversarial approach for lifelong, incremental domain adaptation which benefits
from unsupervised alignment to a series of intermediate domains which
successively diverge from the labelled source domain. We empirically
demonstrate that our incremental approach improves handling of large appearance
changes, e.g. day to night, on a traversable-path segmentation task compared
with a direct, single alignment step approach. Furthermore, by approximating
the feature distribution for the source domain with a generative adversarial
network, the deployment module can be rendered fully independent of retaining
potentially large amounts of the related source training data for only a minor
reduction in performance.
Comment: International Conference on Robotics and Automation 201
Simple Online and Realtime Tracking with a Deep Association Metric
Simple Online and Realtime Tracking (SORT) is a pragmatic approach to
multiple object tracking with a focus on simple, effective algorithms. In this
paper, we integrate appearance information to improve the performance of SORT.
Due to this extension we are able to track objects through longer periods of
occlusion, effectively reducing the number of identity switches. In the spirit of
the original framework we place much of the computational complexity into an
offline pre-training stage where we learn a deep association metric on a
large-scale person re-identification dataset. During online application, we
establish measurement-to-track associations using nearest neighbor queries in
visual appearance space. Experimental evaluation shows that our extensions
reduce the number of identity switches by 45%, achieving overall competitive
performance at high frame rates.
Comment: 5 pages, 1 figure
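The measurement-to-track association by nearest neighbour queries in appearance space, as described in the abstract above, might be sketched roughly as follows. This is a simplified illustration and not the authors' implementation: the cosine distance, the greedy matching order, the `max_distance` threshold, and the names `cosine_distance` and `associate` are all assumptions made for the example, and appearance vectors are assumed to be non-zero.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two non-zero appearance vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def associate(track_galleries, detections, max_distance=0.2):
    """Greedily match detections to tracks by nearest neighbour distance.

    track_galleries: dict mapping track id -> list of stored appearance vectors
    detections:      list of appearance vectors for the current frame
    max_distance:    hypothetical gating threshold (an assumption)
    Returns a dict mapping detection index -> track id.
    """
    candidates = []
    for track_id, gallery in track_galleries.items():
        for det_idx, det in enumerate(detections):
            # Nearest neighbour query: smallest distance to any stored vector.
            d = min(cosine_distance(g, det) for g in gallery)
            if d <= max_distance:
                candidates.append((d, track_id, det_idx))

    # Accept the closest pairs first; each track and detection is used once.
    candidates.sort()
    matches = {}
    used_tracks = set()
    for d, track_id, det_idx in candidates:
        if track_id in used_tracks or det_idx in matches:
            continue
        matches[det_idx] = track_id
        used_tracks.add(track_id)
    return matches


galleries = {1: [[1.0, 0.0]], 2: [[0.0, 1.0]]}
frame_detections = [[0.0, 0.9], [0.9, 0.1], [-1.0, 0.0]]
result = associate(galleries, frame_detections)
```

In this toy input the third detection is far from both tracks in appearance space and is left unmatched, which in a tracker would typically start a new track hypothesis.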
Addressing Appearance Change in Outdoor Robotics with Adversarial Domain Adaptation
Appearance changes due to weather and seasonal conditions represent a strong
impediment to the robust implementation of machine learning systems in outdoor
robotics. While supervised learning optimises a model for the training domain,
it will deliver degraded performance in application domains that are subject to
distributional shifts caused by these changes. Traditionally, this problem has
been addressed via the collection of labelled data in multiple domains or by
imposing priors on the type of shift between both domains. We frame the problem
in the context of unsupervised domain adaptation and develop a framework for
applying adversarial techniques to adapt popular, state-of-the-art network
architectures with the additional objective to align features across domains.
Moreover, as adversarial training is notoriously unstable, we first perform an
extensive ablation study, adapting many techniques known to stabilise
generative adversarial networks, and evaluate on a surrogate classification
task with the same appearance change. The distilled insights are applied to the
problem of free-space segmentation for motion planning in autonomous driving.
Comment: In Proceedings of the 2017 IEEE/RSJ International Conference on
Intelligent Robots and Systems (IROS 2017)
Meshed Up: Learnt Error Correction in 3D Reconstructions
Dense reconstructions often contain errors that prior work has so far
minimised using high quality sensors and regularising the output. Nevertheless,
errors still persist. This paper proposes a machine learning technique to
identify errors in three dimensional (3D) meshes. Beyond simply identifying
errors, our method quantifies both the magnitude and the direction of depth
estimate errors when viewing the scene. This enables us to improve the
reconstruction accuracy.
We train a suitably deep network architecture with two 3D meshes: a
high-quality laser reconstruction, and a lower quality stereo image
reconstruction. The network predicts the amount of error in the lower quality
reconstruction with respect to the high-quality one, having only viewed the
former through its input. We evaluate our approach by correcting
two-dimensional (2D) inverse-depth images extracted from the 3D model, and show
that our method improves the quality of these depth reconstructions by up to a
relative 10% RMSE.
Comment: Accepted for the International Conference on Robotics and Automation
(ICRA) 201
Development of a dragline in-bucket bulk density monitor
This paper details the implementation and trialling of a prototype in-bucket bulk density monitor on a production dragline. Bulk density information can provide feedback to mine planning and scheduling to improve blasting and consequently facilitate optimal bucket sizing. The bulk density measurement builds upon outcomes presented in the AMTC2009 paper titled ‘Automatic In-Bucket Volume Estimation for Dragline Operations’ and utilises payload information from a commercial dragline monitor. While the previous paper explains the algorithms and theoretical basis for the system design and scaled model testing, this paper focuses on the full-scale implementation and the challenges involved.
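As a rough illustration of how a payload reading and an in-bucket volume estimate combine into a bulk density figure, here is a minimal sketch. It assumes the standard definition of bulk density as mass divided by occupied volume; the function name, the units, and the sample numbers are hypothetical and are not taken from the paper.

```python
def bulk_density(payload_tonnes, volume_m3):
    """Bulk density in tonnes per cubic metre from payload mass and volume.

    payload_tonnes: payload mass, e.g. from a dragline monitor (assumed unit)
    volume_m3:      estimated in-bucket material volume (assumed unit)
    """
    if volume_m3 <= 0:
        raise ValueError("volume must be positive")
    return payload_tonnes / volume_m3


# Hypothetical bucket load: 90 tonnes of material occupying 50 cubic metres.
density = bulk_density(90.0, 50.0)
```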
Scrutinizing and De-Biasing Intuitive Physics with Neural Stethoscopes
Visually predicting the stability of block towers is a popular task in the
domain of intuitive physics. While previous work focusses on prediction
accuracy, a one-dimensional performance measure, we provide a broader analysis
of the learned physical understanding of the final model and how the learning
process can be guided. To this end, we introduce neural stethoscopes as a
general purpose framework for quantifying the degree of importance of specific
factors of influence in deep neural networks as well as for actively promoting
and suppressing information as appropriate. In doing so, we unify concepts from
multitask learning as well as training with auxiliary and adversarial losses.
We apply neural stethoscopes to analyse the state-of-the-art neural network for
stability prediction. We show that the baseline model is susceptible to being
misled by incorrect visual cues. This leads to a performance breakdown to the
level of random guessing when training on scenarios where visual cues are
inversely correlated with stability. Using stethoscopes to promote meaningful
feature extraction increases performance from 51% to 90% prediction accuracy.
Conversely, training on an easy dataset where visual cues are positively
correlated with stability, the baseline model learns a bias leading to poor
performance on a harder dataset. Using an adversarial stethoscope, the network
is successfully de-biased, leading to a performance increase from 66% to 88%
- …